ABSTRACT

The model is a "generalized switch", serving multiple traffic flows in discrete time. The switch uses the MaxWeight algorithm to make a service decision (scheduling choice) at each time step, which determines the probability distribution of the amount of service that will be provided. We are primarily motivated by the following question: in the heavy traffic regime, when the switch load approaches critical level, will the service processes provided to each flow remain "smooth" (i.e., without large gaps in service)? Addressing this question reduces to the analysis of the asymptotic behavior of the unscaled queue-differential process in heavy traffic. We prove that the stationary regime of this process converges to that of a positive recurrent Markov chain, whose structure we explicitly describe. This in turn implies asymptotic "smoothness" of the service processes.
1. INTRODUCTION

Suppose we have a system in which several data traffic flows share a common transmission medium (or channel). Sharing means that in each time slot a scheduler chooses a transmission mode - the subset of flows to serve and the corresponding transmission rates; the outcome of each transmission (the number of successfully delivered packets) is random. The scheduler has two key objectives: (a) the time-average (successful) transmission rate of each flow $i$ has to be at least some $\lambda_i > 0$; (b) the successful transmissions for each flow need to be spread out "smoothly" in time - without large time-gaps between successful transmissions. Such models arise, for example, when the goal is timely delivery of information over a shared wireless channel [5].

A very natural way to approach this problem is to treat the model as a queueing system, where services (transmissions) are controlled by a so-called MaxWeight scheduler (see [3,9,10]), which serves a set of virtual queues (one for each traffic flow), each receiving new work at the rate $\lambda_i$. (See e.g. [1].) This automatically achieves objective (a), if this is feasible at all; MaxWeight is known to be throughput optimal, that is, it stabilizes the queues whenever this is feasible. The MaxWeight stability results, however, do not tell whether or not objective (b) is achieved. Specifically, when the system is heavily loaded, i.e. the vector $\lambda = (\lambda_i)$ is within the system rate region $V$ but close to its boundary, the steady-state queue lengths under MaxWeight are necessarily large, and it is conceivable that this may result in large time-gaps in service for individual flows. (Note that, if (a) and (b) are the objectives and the queues are virtual, the large queue lengths in themselves are not an issue. As long as (a) and (b) are achieved, minimizing the queue lengths is not important.) Our main results show that this is not the case. Namely, in the heavy traffic regime, when $\lambda \to \lambda^*$, where $\lambda^*$ is a point on the outer boundary of the rate region $V$, the service process remains "smooth", in the sense that its stationary regime converges to that of a positive recurrent Markov chain, whose structure is given explicitly.

To obtain "clean" convergence results, we assume that the amount of new work arriving in the queues in each time slot is random and has a continuous distribution. (The amounts of service are random, but discrete.) Under this assumption, the state spaces of the processes that we consider are continuous. On the one hand, this makes the analysis more involved (because the notion of positive recurrence is more involved for a continuous state space, as opposed to a countable one). On the other hand, this makes all stationary distributions absolutely continuous w.r.t. the corresponding Lebesgue measure, making it easier to prove convergence. We emphasize that the assumption of a continuous distribution of the arriving work is non-restrictive; if we create virtual queues, artificially, for the purpose of applying the MaxWeight algorithm, the structure of the virtual arrival process is within our control.

The problem essentially reduces to the analysis of stationary versions of the queue-differential process $Y$, which is the projection of the (weighted) queue length process on the subspace orthogonal to the outer normal cone $\nu$ to the rate region $V$ at the point $\lambda^*$. As we show, in the heavy-traffic limit, in steady state, the values of the queue-differential process $Y$ uniquely determine the decisions chosen by the MaxWeight scheduler. Note that the process $Y$ is obtained by projection only, without any scaling depending on the system load.


The model that we consider is essentially a "generalized switch" of [9]. Some features of our model, namely random service outcome and continuous amounts of arriving work, as well as the objective (b), are motivated by applications such as timely delivery of packets of multiple flows over a shared wireless channel [5]. The model of [5] is a special case of ours; paper [5] introduces a debt scheme and proves that it achieves the throughput objective (a); the objective (b) is not considered in [5].

The analysis of MaxWeight stability has a long history, starting 
from the seminal paper [10], which introduced 
MaxWeight; heavy traffic analysis of the algorithm originated 
in [9]. (See, e.g., [3] for an extensive recent review of 
MaxWeight literature.) 

The line of work most closely related to this paper is that in [3,6,7]. Paper [3] studies MaxWeight under the heavy traffic regime and under the additional assumption that the normal cone $\nu$ is one-dimensional, i.e. it is a ray. (The latter assumption is usually referred to in the literature as complete resource pooling (CRP).) Paper [3] shows, in particular, the stationary distribution tightness of what we call the queue-differential process $Y$ in heavy traffic. Part of our analysis also shows the stationary distribution tightness of $Y$; it is analogous to that in [3] (and we also borrow a lot of notation from [3]). Besides the difference in models, our proof of tightness is more general in that it applies to the non-CRP case; this more general argument is close to that used in [2]. From the tightness of stationary distributions, using the structure of the corresponding continuous state space, we obtain the convergence of the stationary version of the (non-Markov) process $Y$ to that of a positive recurrent Markov chain, whose structure we explicitly describe.

Papers [6,7] consider objective (b) in the heavy traffic regime. They introduce a modification of MaxWeight, called the regular service guarantee (RSG) scheme, which explicitly tracks the service time-gaps for each flow to dynamically increase the scheduling priority of flows with large current time-gaps. The papers prove that RSG, under certain parameter settings, preserves the heavy-traffic queue-length minimization properties of MaxWeight, under the CRP condition; at the same time, the papers demonstrate via simulations that RSG improves smoothness (regularity) of the service process. Recall that in this paper we focus on the "pure" MaxWeight, without CRP, and formally show the service process smoothness in the heavy traffic limit.

The rest of the paper is organized as follows. The formal model is presented in Section 2. Section 3 describes the MaxWeight algorithm and the heavy traffic asymptotic regime. Our main results, Theorems 2 and 21, are described in Section 4. (The formal statement of Theorem 21 is in Section 9.) The CRP condition is defined in Section 5. In Section 6 we provide some necessary background and results for general state-space Markov chains. In Sections 7-9 we prove our results for the special case when CRP holds. Finally, in Section 10 we show how the proofs generalize to the case when CRP does not necessarily hold.


1.1 Basic notation

Elements of a Euclidean space $\mathbb{R}^N$ will be viewed as row-vectors, and written in bold font; $\|a\|$ is the usual Euclidean norm of vector $a$. For two vectors $a$ and $b$, $a \cdot b$ denotes their scalar (dot) product; vector inequalities are understood componentwise; the zero vector and the vector of all ones are denoted $0$ and $1$, respectively; $ab$ will denote the vector obtained by componentwise multiplication; if all components of $b$ are non-zero, $\frac{a}{b}$ will denote the vector obtained by componentwise division; the statement "$a$ is a positive vector" means $a > 0$. The closed ball of radius $r$ centered at $x$ is $B_r(x)$. The positive orthant of $\mathbb{R}^N$ is denoted $\mathbb{R}^N_+ = \{x \in \mathbb{R}^N : x \ge 0\}$.

For numbers $a$ and $b$, we denote $a \vee b = \max(a, b)$, $a \wedge b = \min(a, b)$, $a^+ = a \vee 0$. For vectors $a < b$, we denote by $[a, b]$ the rectangle $\times_{i=1}^{N} [a_i, b_i]$ in $\mathbb{R}^N$.

We always consider the Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R}^N)$ (resp. $\mathcal{B}(\mathbb{R}^N_+)$) on $\mathbb{R}^N$ (resp. $\mathbb{R}^N_+$), when the latter is viewed as a measurable space. Lebesgue measure on $\mathbb{R}^N$ is denoted by $\mathcal{L}$. When we consider a linear subspace of $\mathbb{R}^N$, we endow it with the Euclidean metric and the corresponding Borel $\sigma$-algebra and Lebesgue measure.

For a random process $W(t)$, $t = 0, 1, 2, \ldots$, we often use notation $W(\cdot)$ or simply $W$.

2. SYSTEM MODEL

We consider a system of $N$ flows served by a "switch", which evolves in discrete time $t = 0, 1, \ldots$. At the beginning of each time-slot, the scheduler has to choose from a finite number $K$ of "service-decisions". If the service decision $k \in \{1, \ldots, K\}$ is chosen, then independently of the past history the flows get an amount of service, given by a random non-negative vector. Furthermore, we assume that (if decision $k$ is chosen) there is a finite number $O_k$ of possible service-vector outcomes, i.e. with probability $p^{k,j}$, $j = 1, \ldots, O_k$, it is given by a non-negative vector $v^{k,j} = (v^{k,j}_1, \ldots, v^{k,j}_N)$. The expected service vector for decision $k$ is denoted $\mu^k = (\mu^k_1, \ldots, \mu^k_N) = \sum_{j=1}^{O_k} p^{k,j} v^{k,j}$. We assume that vectors $\mu^k$ are non-zero and different from each other, and that for each $i$ there exists $k$ such that $\mu^k_i > 0$. We will use notations

$$S^{max}_i = \max v^{k,j}_i \ \text{over all } k \text{ and } j; \qquad S^{max} = (S^{max}_1, \ldots, S^{max}_N).$$

We denote by $S(t) = (S_1(t), \ldots, S_N(t))$ the (random) realization of the service vector at time $t$, and call $S(\cdot)$ the service process.

After the service at time $t$ is completed, a random amount of work arrives into the queues, and it is given by a non-negative vector $A(t) = (A_1(t), \ldots, A_N(t))$. The values of $A(t)$ are i.i.d. across times $t$, and $A(\cdot)$ is called the arrival process. The mean arrival rates of this process are given by vector

$$\lambda = (\lambda_1, \ldots, \lambda_N) = E A(t).$$

We will now make assumptions on the distribution of $A(t)$. The distribution is absolutely continuous w.r.t. Lebesgue measure and is concentrated on the rectangle $[0, A^{max}]$ for some constant vector $A^{max} > S^{max}$; moreover, on this rectangle the distribution density $f(x)$ is both upper and lower bounded by positive constants, i.e. $0 < \delta_* \le f(x) \le \delta^*$.


If $Q(t) = (Q_1(t), \ldots, Q_N(t))$ is the vector of queue lengths at time $t$, then for each $i = 1, \ldots, N$

$$Q_i(t+1) = (Q_i(t) - S_i(t))^+ + A_i(t) = Q_i(t) + A_i(t) - S_i(t) + U_i(t), \tag{1}$$

where $U_i(t) = (S_i(t) - Q_i(t))^+$ is the amount of service "wasted" by flow $i$ at time $t$.
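The recursion (1) can be sketched in a few lines of Python. This is only an illustration: the function name `queue_step` and the numerical values are assumptions, not part of the model description.

```python
def queue_step(q, s, a):
    """One step of recursion (1); returns (next queue vector, wasted service vector)."""
    # U_i(t) = (S_i(t) - Q_i(t))^+ : service offered beyond what the queue holds
    wasted = [max(si - qi, 0.0) for qi, si in zip(q, s)]
    # Q_i(t+1) = (Q_i(t) - S_i(t))^+ + A_i(t)
    next_q = [max(qi - si, 0.0) + ai for qi, si, ai in zip(q, s, a)]
    return next_q, wasted


# Illustrative (hypothetical) values for two flows
q, s, a = [2.0, 0.5], [1.0, 1.0], [0.3, 0.7]
next_q, wasted = queue_step(q, s, a)
# Second form of (1): Q + A - S + U
alt = [qi + ai - si + ui for qi, si, ai, ui in zip(q, s, a, wasted)]
```

Both forms of (1) give the same next state (up to floating-point rounding); the second flow wastes half a unit of service because its queue is shorter than the offered service.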

3. MAXWEIGHT SCHEDULING SCHEME. 
HEAVY TRAFFIC REGIME 

3.1 MaxWeight definition 

Let a vector $\gamma = (\gamma_1, \ldots, \gamma_N) > 0$ be fixed. The MaxWeight scheduling algorithm chooses, at each time $t$, a service decision

$$k \in \arg\max_{l} \, (\gamma Q(t)) \cdot \mu^{l}, \tag{2}$$

with ties broken according to any well defined rule.
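As a minimal sketch of the decision rule (2), the following Python function picks a maximizing decision. The name `maxweight_decision` and the tie-breaking rule (smallest index) are assumptions made for illustration; the text allows any well defined rule.

```python
def maxweight_decision(q, gamma, mu):
    """Return an index k maximizing (gamma * q) . mu[k].

    q     : current queue lengths, one per flow
    gamma : fixed positive weights, one per flow
    mu    : list of expected service vectors, one per decision
    Ties are broken in favor of the smallest index (an arbitrary choice).
    """
    weighted = [g * x for g, x in zip(gamma, q)]
    scores = [sum(w * m for w, m in zip(weighted, mu_k)) for mu_k in mu]
    # max() returns the first maximal element, i.e. the smallest maximizing index
    return max(range(len(mu)), key=lambda k: scores[k])


# Hypothetical two-flow example: decision 0 serves flow 0, decision 1 serves flow 1
mu = [[1.0, 0.0], [0.0, 1.0]]
choice_long_first = maxweight_decision([3.0, 1.0], [1.0, 1.0], mu)
choice_reweighted = maxweight_decision([3.0, 1.0], [1.0, 5.0], mu)
```

With equal weights the longer queue is served; changing $\gamma$ can shift the decision to the other flow.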

Under MaxWeight, the queue length process $Q(\cdot)$ is a discrete time Markov chain with (continuous) state space $\mathbb{R}^N_+$. System stability is understood as positive Harris recurrence of this Markov chain.

Denote the system rate region by 

$$V = \Big\{ x \in \mathbb{R}^N_+ : x \le \sum_k \psi_k \mu^k \ \text{for some } \psi_k \ge 0, \ \sum_k \psi_k = 1 \Big\}. \tag{3}$$

It is well known (see [3,9,10]) that, in general, under MaxWeight the system is stable as long as the vector of mean arrival rates $\lambda$ is such that $\lambda < x$ for some $x \in V$. (Scheduling rules having this property are sometimes called "throughput-optimal".) This is true for our model as well, as will be shown in Section 7. (Establishing this fact is not difficult, but it does not directly follow from previous work, because we have a continuous state space.)

3.2 Heavy traffic regime 

We will consider a sequence of systems, indexed by $n \to \infty$, operating under MaxWeight scheduling. (Variables pertaining to the $n$-th system will be supplied with superscript $(n)$.) The switch parameters will remain unchanged, but the distribution of $A^{(n)}(t)$ changes with $n$: namely, for each $n$ it has density $f^{(n)}$ which satisfies all conditions specified in Section 2, and $f^{(n)}$ uniformly converges to some density $f^*$. Note that, automatically, the limiting density $f^*$ (as well as each $f^{(n)}$) satisfies bounds $0 < \delta_* \le f^*(x) \le \delta^*$ in the rectangle $[0, A^{max}]$, and is zero elsewhere. The arrival process $A^*(\cdot)$, such that the distribution of $A^*(t)$ has density $f^*$, has the arrival rate vector $\lambda^*$. Correspondingly, $\lambda^{(n)} \to \lambda^*$.

We assume that $\lambda^* > 0$ is a maximal element of rate region $V$, i.e. $x \ge \lambda^*$ and $x \in V$ only when $x = \lambda^*$. Thus, $\lambda^*$ lies on the outer boundary of $V$. We further assume that for each $n$, $\lambda^{(n)}$ lies in the interior of $V$; therefore, the system is stable for each $n$ (under the MaxWeight algorithm).

The (limiting) system, with arrival process $A^*(\cdot)$, is called critically loaded.


4. MAIN RESULTS 

Consider the sequence of systems described in Section 3, in the heavy traffic regime. Under any throughput-optimal scheduling algorithm, for each $n$, the steady-state average amount of service provided to each flow $i$ is greater than or equal to its arrival rate $\lambda^{(n)}_i$. (It may, and typically will, be greater if the wasted service is taken into account.)

We now define the notion of asymptotic smoothness of the steady-state service process. Informally, it means the property that as the system load approaches critical, the steady-state service processes are such that for each flow the probability of a $T$-long gap (without any service at all) uniformly vanishes, as $T \to \infty$.

For each $n$, consider the cumulative service process $G^{(n)}(\cdot)$ in steady state. Namely,

$$G^{(n)}(t) := \sum_{\tau=1}^{t} S^{(n)}(\tau), \quad t = 1, 2, \ldots$$

Definition 1. We call the service process asymptotically smooth, if

$$\max_i \lim_{T \to \infty} \limsup_{n \to \infty} P\{G^{(n)}_i(T) = 0\} = 0. \tag{4}$$
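Definition 1 concerns the steady-state probability that a flow receives no service over $T$ consecutive slots. As a rough empirical counterpart, one can count zero-service windows in a long simulated service trace. The sketch below assumes such a trace is already available as a list of per-slot service amounts for one flow; the function name `empirical_gap_probability` is hypothetical.

```python
def empirical_gap_probability(service, T):
    """Fraction of length-T windows in which the flow receives no service.

    service : list of non-negative per-slot service amounts for one flow
    T       : window length (number of consecutive time slots)
    """
    windows = len(service) - T + 1
    if windows <= 0:
        raise ValueError("need at least T samples")
    gaps = sum(
        1 for start in range(windows)
        if sum(service[start:start + T]) == 0   # cumulative service over the window is zero
    )
    return gaps / windows


# Tiny hand-checkable trace (illustrative values only)
trace = [1, 0, 0, 1, 0, 0, 0, 1]
p2 = empirical_gap_probability(trace, 2)
p3 = empirical_gap_probability(trace, 3)
p4 = empirical_gap_probability(trace, 4)
```

For this trace the estimate shrinks as the window grows, which is the qualitative behavior that asymptotic smoothness requires uniformly in the load.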

Our key result (Theorem 21 in Section 9) shows that a "queue-differential" process, which determines scheduling decisions in the system under MaxWeight in heavy traffic, is such that its stationary version converges to that of a stationary positive Harris recurrent Markov chain, whose structure we describe explicitly. This result, in particular, will imply the following

Theorem 2. Consider the sequence of systems described in Section 3, in the heavy traffic regime. Under MaxWeight scheduling, the service process is asymptotically smooth.

The proof is given in Section 9. 
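The statement of Theorem 2 can be explored numerically on a toy instance. The sketch below is not the paper's experiment; every parameter is an assumption chosen for illustration: two flows, two decisions (decision $k$ delivers one unit to flow $k$ with probability 1/2 and nothing otherwise, so $\mu^1 = (0.5, 0)$, $\mu^2 = (0, 0.5)$), $\gamma = (1, 1)$, and arrivals drawn per flow from a mixture of two uniform laws so that the density is bounded above and below on $[0, A^{max}]$ with $A^{max} = 1.2 > S^{max} = 1$.

```python
import random


def simulate_toy_switch(steps, mean_arrival, seed=0, a_max=1.2, w=0.2):
    """Simulate a hypothetical 2-flow switch under MaxWeight; return (final queues, service traces)."""
    rng = random.Random(seed)
    # Arrival law per flow: with prob. w uniform on [0, a_max], else uniform on [0, c];
    # c is chosen so that the mean equals mean_arrival.
    c = 2.0 * (mean_arrival - w * a_max / 2.0) / (1.0 - w)
    if not 0.0 < c <= a_max:
        raise ValueError("mean_arrival is incompatible with a_max and w")
    q = [0.0, 0.0]
    service = [[], []]
    for _ in range(steps):
        # MaxWeight with gamma = (1, 1): (gamma q) . mu^k = 0.5 * q[k], so serve the longer queue
        k = 0 if q[0] >= q[1] else 1
        s = [0.0, 0.0]
        if rng.random() < 0.5:          # random service outcome for the chosen decision
            s[k] = 1.0
        for i in range(2):
            service[i].append(s[i])
            width = a_max if rng.random() < w else c
            q[i] = max(q[i] - s[i], 0.0) + rng.uniform(0.0, width)   # recursion (1)
    return q, service


def longest_gap(trace):
    """Length of the longest run of consecutive zero-service slots."""
    best = run = 0
    for x in trace:
        run = run + 1 if x == 0 else 0
        best = max(best, run)
    return best


# Load close to critical: the boundary of the toy rate region is lambda_1 + lambda_2 = 0.5
q_final, service = simulate_toy_switch(steps=20000, mean_arrival=0.24, seed=0)
gaps = [longest_gap(service[0]), longest_gap(service[1])]
```

Rerunning with `mean_arrival` closer to 0.25 and comparing `gaps` (or the window-based gap frequencies) gives an informal feel for the claim that service gaps do not blow up as the load approaches critical; a single finite run is, of course, no substitute for the proof.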

5. COMPLETE RESOURCE POOLING 
CONDITION 

To improve exposition, we first give detailed proofs of our main results for the special case when the following complete resource pooling (CRP) condition holds. (In Section 10 we will show how the proof generalizes to the case without the CRP condition.) Assume that vector $\lambda^*$ is such that there is a unique (up to scaling) outer normal vector $\nu > 0$ to $V$ at point $\lambda^*$; we choose $\nu$ so that $\|\nu\| = 1$. Denote by

$$V^* = \arg\max_{x \in V} \nu \cdot x \tag{5}$$

the outer face of $V$ where $\lambda^*$ lies. Given our assumptions on $\lambda^*$, it lies in the relative interior of $V^*$.

By $\nu_\perp$ we denote the subspace of $\mathbb{R}^N$ orthogonal to $\nu$. For any vector $a$, we denote by $a_\parallel = (a \cdot \nu)\nu$ its orthogonal projection on the (one-dimensional) subspace spanned by $\nu$, and by $a_\perp = a - a_\parallel$ its orthogonal projection on the $(N-1)$-dimensional subspace $\nu_\perp$.
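The decomposition of a vector into its component along the normal vector and its component in the orthogonal subspace can be sketched as follows. The function name `decompose` and the example vectors are illustrative assumptions; the normalization step is included only so the sketch also works when the supplied normal vector is not of unit length.

```python
import math


def decompose(a, nu):
    """Split a into (a_par, a_perp): the parts along nu and orthogonal to nu."""
    norm = math.sqrt(sum(x * x for x in nu))
    unit = [x / norm for x in nu]                  # unit normal vector
    coeff = sum(x * u for x, u in zip(a, unit))    # scalar product a . nu
    a_par = [coeff * u for u in unit]              # (a . nu) nu
    a_perp = [x - p for x, p in zip(a, a_par)]     # a - a_par
    return a_par, a_perp


# Hypothetical example in the plane with nu proportional to (1, 1)
a_par, a_perp = decompose([3.0, 1.0], [1.0, 1.0])
```

Here the parallel part is $(2, 2)$ and the orthogonal part is $(1, -1)$; the latter is the kind of quantity tracked by the queue-differential process.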



The following observations and notations will be useful. There is a $\delta > 0$ such that the entire set

$$\mathcal{B}_{\lambda^*} = \{y \in V^* : \|y - \lambda^*\| \le \delta\} \tag{6}$$

also lies in the relative interior of $V^*$.

6. BACKGROUND ON GENERAL-STATE- 
SPACE DISCRETE-TIME 
MARKOV CHAINS 

We will briefly discuss some notions and results from [8] and [4] on the stability of discrete time Markov chains (MC), which will be used in later sections. Throughout this section we will assume that the Markov chain $\Phi = \{\Phi(0), \Phi(1), \ldots\}$ is evolving on a locally compact separable metric space $X$ whose Borel $\sigma$-algebra will be denoted by $\mathcal{B}$. $P_\eta$ and $E_\eta$ are used to denote the probabilities and expectations conditional on $\Phi(0)$ having distribution $\eta$, while $P_x$ and $E_x$ are used when $\eta$ is concentrated at $x$. The transition function of $\Phi$ is denoted by $P(x, A)$, $x \in X$, $A \in \mathcal{B}$. The iterates $P^t$, $t = 0, 1, 2, \ldots$, are defined inductively by

$$P^0 := I, \qquad P^t := P P^{t-1}, \quad t \ge 1,$$

where $I$ is the identity transition function.

Definition 3. (i) $\varphi$-irreducibility: A Markov chain $\Phi = \{\Phi(0), \Phi(1), \ldots\}$ is called $\varphi$-irreducible if there exists a finite measure $\varphi$ such that $\sum_{k=1}^{\infty} P^k(x, A) > 0$ for all $x \in X$ whenever $\varphi(A) > 0$. Measure $\varphi$ is called an irreducibility measure.

(ii) Harris recurrence: If $\Phi$ is $\varphi$-irreducible and $P_x(\Phi(t) \in A \text{ i.o.}) = 1$ whenever $\varphi(A) > 0$, then $\Phi$ is called Harris recurrent. [Abbreviation 'i.o.' means 'infinitely often'.]

(iii) Invariant measure: A $\sigma$-finite measure $\pi$ on $\mathcal{B}$ with the property

$$\pi(A) = \pi P(A) = \int \pi(dx) P(x, A), \quad \forall A \in \mathcal{B},$$

is called an invariant measure.

(iv) Positive Harris recurrence: If $\Phi$ is Harris recurrent with a finite invariant measure $\pi$, then it is called positive Harris recurrent.

(v) Boundedness in probability: If for any $\epsilon > 0$ and any $x \in X$, there exists a compact set $D$ such that

$$\liminf_{t \to \infty} P_x(\Phi(t) \in D) \ge 1 - \epsilon, \tag{7}$$

then the Markov process $\Phi$ is called bounded in probability.

(vi) Small sets: A set $C$ is called small if for all $x \in C$ and some integer $l \ge 1$, we have

$$P^l(x, \cdot) \ge \nu(\cdot), \tag{8}$$

where $\nu(\cdot)$ is a sub-probability measure, i.e. $\nu(X) \le 1$.

(vii) For a probability distribution $a = (a_1, a_2, \ldots)$ on $\{1, 2, \ldots\}$, the Markov transition function $K_a$ is defined as

$$K_a := \sum_{i=1}^{\infty} a_i P^i.$$

(viii) Petite sets: A set $A \in \mathcal{B}$ and a sub-probability measure $\psi$ on $\mathcal{B}(X)$ are called petite if for some probability distribution $a$ on $\{1, 2, \ldots\}$ we have

$$K_a(x, \cdot) \ge \psi(\cdot), \quad \forall x \in A.$$

(ix) Non-evanescence: A Markov chain $\Phi$ is called non-evanescent if $P_x\{\Phi \to \infty\} = 0$ for each $x \in X$. [Event $\{\Phi \to \infty\}$ consists of the outcomes such that the sequence $\Phi(t)$ visits any compact set at most a finite number of times.]

The following proposition states some results from [8].

Proposition 4. (i) If a set $A$ is small and for some probability distribution $a$ on $\{1, 2, \ldots\}$ and a set $B \in \mathcal{B}$, we have

$$\inf_{x \in B} K_a(x, A) > 0, \tag{9}$$

then $B$ is petite.

(ii) Suppose that every compact subset of $X$ is petite. Then $\Phi$ is positive Harris recurrent if and only if it is bounded in probability.

(iii) Suppose that every compact subset of $X$ is petite. Then $\Phi$ is Harris recurrent if and only if it is non-evanescent.

The following result is from [4]. It is stated in a form convenient for its application in this paper.

Proposition 5. Let $L(x)$ be a non-negative (Lyapunov) function such that the Markov process $\Phi$ satisfies the following two conditions, for some positive constants $\kappa$, $\delta$, $D$:

(a) $E[L(\Phi(t+1)) - L(\Phi(t)) \mid \Phi(t) = x] \le -\delta$, for any state $x$ such that $L(x) > \kappa > 0$.

(b) $|L(\Phi(t+1)) - L(\Phi(t))| \le D$.

Then there exist constants $\eta > 0$ and $0 < \rho < 1$ such that

$$P(L(\Phi(t)) \ge u \mid L(\Phi(0)) = b) \le \rho^t \exp(\eta(b - u)) + \frac{1 - \rho^t}{1 - \rho} D \exp(\eta(\kappa - u)), \quad u \ge 0. \tag{10}$$
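To get a feel for the shape of the tail bound in Proposition 5, its right-hand side can be evaluated numerically. The sketch below is only a calculator for that expression: the proposition asserts that suitable constants exist but does not give their values, so all numbers used here are hypothetical.

```python
import math


def drift_tail_bound(t, u, b, eta, rho, kappa, D):
    """Right-hand side of the tail bound: rho^t e^{eta(b-u)} + (1-rho^t)/(1-rho) D e^{eta(kappa-u)}."""
    transient = rho ** t * math.exp(eta * (b - u))          # effect of the initial state, decays in t
    geometric = (1.0 - rho ** t) / (1.0 - rho)              # partial geometric sum 1 + rho + ... + rho^(t-1)
    stationary = geometric * D * math.exp(eta * (kappa - u))
    return transient + stationary


# Hypothetical constants, chosen only to illustrate the exponential decay in u
bound_u10 = drift_tail_bound(t=50, u=10.0, b=5.0, eta=0.5, rho=0.9, kappa=2.0, D=1.0)
bound_u20 = drift_tail_bound(t=50, u=20.0, b=5.0, eta=0.5, rho=0.9, kappa=2.0, D=1.0)
```

The bound decays exponentially in $u$ and, as $t$ grows, loses its dependence on the initial value $b$; for small $u$ it may exceed 1, in which case it is vacuous.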

7. QUEUE LENGTH PROCESS 

Recall that $Q^{(n)}(\cdot)$ is the queue length process for the $n$-th system under MaxWeight. In this section we prove that for all $n$, the process $Q^{(n)}(\cdot)$ is positive Harris recurrent. The proof uses a Lyapunov drift argument which is fairly standard (in fact, there is more than one way to prove stability of $Q^{(n)}(\cdot)$), except that, since our state space is continuous, as a first step we will show that all compact sets are petite.

Some simple preliminary observations are given in the following lemma.

Lemma 6. (i) The points $x \in \mathbb{R}^N_+$, such that $k \in \arg\max_l (\gamma x) \cdot \mu^l$ is non-unique, form a set of zero Lebesgue measure. Moreover, if $x > 0$ is such that $k \in \arg\max_l (\gamma x) \cdot \mu^l$ is unique, then for a sufficiently small $\epsilon > 0$ the decision $k$ is also the unique element of $\arg\max_l (\gamma y) \cdot \mu^l$ for all $y \in B_\epsilon(x)$.

(ii) The one-step transition function $P^{(n)}(x, \cdot)$ of the process $Q^{(n)}(\cdot)$ is such that, uniformly in $n$ and $x \in \mathbb{R}^N_+$, the distribution $P^{(n)}(x, \cdot)$ is absolutely continuous with the density upper bounded by $\delta^*$ and, in the rectangle $[0, A^{max} - S^{max}]$, lower bounded by $\delta_*$.

Proof. Statement (i) easily follows from the finiteness of the set of decisions $k$. Statement (ii) easily follows from the assumptions on the arrival process distribution and the fact that $S^{max} < A^{max}$. □


Lemma 7. For any $x > 0$, there exists $\epsilon > 0$ such that the set $B_\epsilon(x)$ is small for the process $Q^{(n)}(\cdot)$.

Proof. Consider the rectangle

$$H = [x + (1/3)(A^{max} - S^{max}), \ x + (2/3)(A^{max} - S^{max})].$$

Choose $\epsilon > 0$ small enough, so that $\epsilon < (1/3) \min_i (A^{max}_i - S^{max}_i)$ and $\epsilon < \min_i x_i$. Then, $B_\epsilon(x)$ lies in the interior of $\mathbb{R}^N_+$ and every point in $B_\epsilon(x)$ is strictly smaller than any point in $H$. Lemma 6(ii) implies that for any $y \in B_\epsilon(x)$, the distribution $P^{(n)}(y, \cdot)$ has a density lower bounded by $\delta_*$ in $H$. □

Lemma 8. For the Markov process $Q^{(n)}(\cdot)$, any compact set is petite.

Proof. Consider a compact set $G \subset \mathbb{R}^N_+$; of course, $G$ is bounded. Fix arbitrary $x > 0$ and pick $\epsilon > 0$ small enough, so that $B_\epsilon(x)$ is small and lies in the interior of $\mathbb{R}^N_+$. Pick small $\delta > 0$ such that any point in $\{\|y\| \le \delta\}$ is strictly less than any point in $B_\epsilon(x)$.

It is easy to verify that there exists an integer $\tau > 0$ such that the following holds uniformly in $Q^{(n)}(0) \in G$:

$$P\{\|Q^{(n)}(\tau)\| \le \delta\} \ge \alpha \ \text{for some } \alpha > 0. \tag{11}$$

Indeed, suppose first that for all $t = 0, 1, \ldots$, $A(t) = 0$. (This is a probability zero event, of course, but let us consider it anyway.) Then, for any $\delta_3 > 0$ there exist $\delta_1, \delta_2 > 0$ such that the following holds: with probability at least $\delta_1 > 0$, the norm $\|Q^{(n)}(t)\|$ decreases at least by $\delta_2 > 0$, at each time $t$ when $\|Q^{(n)}(t)\| \ge \delta_3 > 0$. This implies that for some $\tau > 0$ and $\delta_4 > 0$, $Q^{(n)}(0) \in G$ implies $P\{\|Q^{(n)}(\tau)\| \le \delta_3\} \ge \delta_4$. Now, using this and the fact that with a positive probability $A^{(n)}(t)$ can be "very close to 0," we can easily establish property (11). (We omit rather trivial details.)

Next, it is easy to show that there exists an integer $\tau_1 > 0$ such that the following holds uniformly in $\|Q^{(n)}(0)\| \le \delta$:

$$P\{Q^{(n)}(\tau_1) \in B_\epsilon(x)\} \ge \alpha_1 \ \text{for some } \alpha_1 > 0. \tag{12}$$

Here we use Lemma 6(ii), which shows that at each time step the distribution of the increments of $Q^{(n)}(\cdot)$ has a density lower bounded by $\delta_*$ in $[0, A^{max} - S^{max}]$.

From (11) and (12) we see that, uniformly in $Q^{(n)}(0) \in G$, $P\{Q^{(n)}(\tau + \tau_1) \in B_\epsilon(x)\} \ge \alpha \alpha_1$. Application of Proposition 4(i) shows that $G$ is petite (and, moreover, that it is small). □


To prove stability, we will apply Proposition 4, which requires the following

Lemma 9. Consider the scalar projection $\|\sqrt{\gamma} Q^{(n)}(t)\|$, $t = 0, 1, \ldots$, of the Markov process $Q^{(n)}(\cdot)$ starting with a fixed initial state $Q^{(n)}(0)$, such that $\|\sqrt{\gamma} Q^{(n)}(0)\| = b$. Then, uniformly on all large $n$ we have

$$P(\|\sqrt{\gamma} Q^{(n)}(t)\| \ge u) \le \rho^t \exp(\eta(b - u)) + \frac{1 - \rho^t}{1 - \rho} D \exp(\eta(\kappa - u)), \quad u \ge 0, \tag{13}$$

for some constants $\eta, \kappa, D > 0$ and $1 > \rho > 0$ which depend on $n$. Consequently, the process $Q^{(n)}(\cdot)$ is bounded in probability.

Proof. We will use notation $L(x) = \|\sqrt{\gamma} x\|$. Then $L(Q^{(n)}(0)) = b$. Clearly, $|L(Q^{(n)}(t+1)) - L(Q^{(n)}(t))|$ is uniformly bounded by a constant, given our assumptions on the arrival and service processes. We will show that the drift (average increment) $L(Q^{(n)}(t+1)) - L(Q^{(n)}(t))$ is upper bounded by some $-\delta < 0$ when $L(Q^{(n)}(t)) \ge \kappa$ for some $\kappa > 0$.

Consider a fixed $Q^{(n)}(t)$ and denote $\Delta L = E[L(Q^{(n)}(t+1)) - L(Q^{(n)}(t))]$. Clearly,

$$\Delta L = E\|\sqrt{\gamma} Q^{(n)}(t+1)\| - \|\sqrt{\gamma} Q^{(n)}(t)\| \le \frac{1}{2\|\sqrt{\gamma} Q^{(n)}(t)\|} \left( E\|\sqrt{\gamma} Q^{(n)}(t+1)\|^2 - \|\sqrt{\gamma} Q^{(n)}(t)\|^2 \right), \tag{14}$$

where the inequality follows from the concavity of the square-root function. Substitute the value of $Q^{(n)}(t+1)$ from equation (1) and concentrate on the numerator of the above expression to obtain

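$$\begin{aligned}
E\|\sqrt{\gamma} Q^{(n)}(t+1)\|^2 &- \|\sqrt{\gamma} Q^{(n)}(t)\|^2 \\
&= E\|\sqrt{\gamma} Q^{(n)}(t) + \sqrt{\gamma}(A^{(n)}(t) - S^{(n)}(t) + U^{(n)}(t))\|^2 - \|\sqrt{\gamma} Q^{(n)}(t)\|^2 \\
&= E\big[\|\sqrt{\gamma}(A^{(n)}(t) - S^{(n)}(t) + U^{(n)}(t))\|^2 + 2 (\sqrt{\gamma} Q^{(n)}(t)) \cdot (\sqrt{\gamma}(A^{(n)}(t) - S^{(n)}(t) + U^{(n)}(t)))\big] \\
&= E\big[\|\sqrt{\gamma}(A^{(n)}(t) - S^{(n)}(t) + U^{(n)}(t))\|^2 + 2 (\gamma Q^{(n)}(t)) \cdot (A^{(n)}(t) - S^{(n)}(t) + U^{(n)}(t))\big] \\
&= E\big[\|\sqrt{\gamma}(A^{(n)}(t) - S^{(n)}(t) + U^{(n)}(t))\|^2 + 2 (\gamma Q^{(n)}(t)) \cdot U^{(n)}(t) + 2 (\gamma Q^{(n)}(t)) \cdot (A^{(n)}(t) - S^{(n)}(t))\big] \\
&\le b_1 + b_2 + 2E\big[(\gamma Q^{(n)}(t)) \cdot (A^{(n)}(t) - S^{(n)}(t)) \mid Q^{(n)}(t)\big],
\end{aligned} \tag{15}$$

where $b_1$ is a uniform bound on $\|\sqrt{\gamma}(A^{(n)}(t) - S^{(n)}(t) + U^{(n)}(t))\|^2$, and $b_2$ is a uniform bound on $|2 (\gamma Q^{(n)}(t)) \cdot U^{(n)}(t)|$, which follows from the property that $U_i(t) > 0$ only when $Q_i(t)$ is sufficiently small.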



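To simplify exposition and avoid introducing additional notation, let us assume that $\lambda^{(n)} - \lambda^* = -\epsilon\nu$ for some $\epsilon > 0$. (If not, then instead of $\lambda^*$ in this proof we can use $\lambda^{**}$, which is the orthogonal projection of $\lambda^{(n)}$ on $V^*$.) Combining (14) and (15), we obtain

$$\begin{aligned}
2\|\sqrt{\gamma} Q^{(n)}(t)\| \, \Delta L &\le b_1 + b_2 + 2E\big[(\gamma Q^{(n)}(t)) \cdot (A^{(n)}(t) - S^{(n)}(t))\big] \\
&= b_1 + b_2 + 2E\big[(\gamma Q^{(n)}(t)) \cdot (A^{(n)}(t) - \lambda^* + \lambda^* - S^{(n)}(t))\big] \\
&= b_1 + b_2 - 2\epsilon \|(\gamma Q^{(n)}(t))_\parallel\| + 2E\big[(\gamma Q^{(n)}(t)) \cdot (\lambda^* - S^{(n)}(t))\big] \\
&\le b_1 + b_2 - 2\epsilon \|(\gamma Q^{(n)}(t))_\parallel\| - \delta \|(\gamma Q^{(n)}(t))_\perp\|,
\end{aligned} \tag{16}$$

where the last inequality follows from the definition of MaxWeight (see (2)) and the set $\mathcal{B}_{\lambda^*}$ (see (6)). If $\|\gamma Q^{(n)}(t)\| \ge x$, then at least one of $\|(\gamma Q^{(n)}(t))_\parallel\|$ or $\|(\gamma Q^{(n)}(t))_\perp\|$ is greater than or equal to $x/\sqrt{2}$. After some algebraic manipulations we obtain (with $\gamma_{min} = \min_i \gamma_i$)

$$\|\sqrt{\gamma} Q^{(n)}(t)\| \ge x \ \Rightarrow \ \|\gamma Q^{(n)}(t)\| \ge \sqrt{\gamma_{min}}\, x \ \Rightarrow \ \epsilon \|(\gamma Q^{(n)}(t))_\parallel\| + \delta \|(\gamma Q^{(n)}(t))_\perp\| \ge (\epsilon \wedge \delta) \frac{\sqrt{\gamma_{min}}\, x}{\sqrt{2}}.$$

Substituting the above in inequality (16), with $x = \|\sqrt{\gamma} Q^{(n)}(t)\|$, we see that the drift is upper bounded by

$$-(\epsilon \wedge \delta) \frac{\sqrt{\gamma_{min}}}{2\sqrt{2}} + \frac{b_1 + b_2}{2\|\sqrt{\gamma} Q^{(n)}(t)\|}.$$

This quantity is uniformly bounded by a negative constant for sufficiently large $x$. Application of Proposition 5 completes the proof. □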


Now the positive recurrence of $Q^{(n)}(\cdot)$ follows from Proposition 4. In fact, we will prove the following stronger statement.


Theorem 10. For each $n = 1, 2, \ldots$, the Markov process $Q^{(n)}(\cdot)$ is positive Harris recurrent and hence has a unique invariant probability distribution, which will be denoted $\chi^{(n)}$. Moreover, if $Q^{(n)}(\infty)$ is the (random) process state in stationary regime (i.e. it has distribution $\chi^{(n)}$), then

$$E[\|Q^{(n)}(\infty)\|^r] < \infty, \quad \forall r > 0.$$


Proof. By Lemma 8 any compact set is petite. Since $Q^{(n)}(\cdot)$ is also bounded in probability (Lemma 9), by Proposition 4 $Q^{(n)}(\cdot)$ is positive Harris recurrent.

For a function $f(\cdot)$ and fixed $b > 0$, denote $T_b f(\cdot) = f(\cdot) \wedge b$. Consider the process starting from an arbitrary fixed initial state $Q^{(n)}(0)$. Since the process is positive Harris recurrent, we can apply the ergodic theorem to obtain (note that $T_b \|\cdot\|^r$ is a bounded continuous function):

$$E\big(T_b \|Q^{(n)}(\infty)\|^r\big) = \lim_{m \to \infty} \frac{1}{m} \sum_{t=0}^{m} E\big[T_b \|Q^{(n)}(t)\|^r\big]. \tag{17}$$

On the other hand,

$$\lim_{m \to \infty} \frac{1}{m} \sum_{t=0}^{m} E\big[T_b \|Q^{(n)}(t)\|^r\big] \le \lim_{m \to \infty} \frac{1}{m} \sum_{t=0}^{m} E\big[\|Q^{(n)}(t)\|^r\big] \le C, \tag{18}$$

for some constant $C > 0$, where the second inequality follows from (13). Combining (17) and (18), we have

$$E\big(T_b \|Q^{(n)}(\infty)\|^r\big) \le C, \quad \forall b > 0, \tag{19}$$

and therefore, by the monotone convergence theorem,

$$E\big(\|Q^{(n)}(\infty)\|^r\big) = \lim_{b \to \infty} E\big(T_b \|Q^{(n)}(\infty)\|^r\big) \le C.$$

□


Lemma 11. Uniformly on all (large) $n$ and the distributions of $Q^{(n)}(0)$, the distribution of $Q^{(n)}(1)$ is absolutely continuous w.r.t. Lebesgue measure, with the density upper bounded by $\delta^*$.

We omit the proof, which is straightforward, given our assumptions on the distribution of $A^{(n)}(t)$.


Lemma 12. As $n \to \infty$, $\|Q^{(n)}(\infty)\| \to \infty$ in probability.

Proof. The proof is by contradiction. Suppose, for some fixed $C > 0$ the compact set $D = \{x \in \mathbb{R}^N : \|x\| \le C\}$ is such that

$$\limsup_{n \to \infty} \chi^{(n)}(D) = \beta > 0. \tag{20}$$

Suppose $Q^{(n)}(t) \in D$. Then, using the same argument as in the proof of Lemma 8, it is easy to see that for any $\epsilon > 0$ there exists time $\tau \ge 1$, such that $P\{\|Q^{(n)}(t + \tau)\| \le \epsilon\} \ge \beta_1 > 0$. This in turn implies that, with probability at least some $\beta_2 > 0$, for at least one flow $i$ the amount of wasted service $U^{(n)}_i(t + \tau) \ge \epsilon_2 > 0$. This implies that, for at least one $i$,

$$\limsup_{n \to \infty} E[U^{(n)}_i(\infty)] \ge \beta_1 \beta_2 \epsilon_2 > 0.$$

This, however, contradicts the fact that the process is stable for all large $n$. □

8. STEADY-STATE QUEUE LENGTHS
DEVIATIONS FROM $\nu$

Let us consider the process $Y^{(n)}(\cdot)$, defined as

$$Y^{(n)}(t) := (\gamma Q^{(n)}(t))_\perp.$$

Lemma 13. The steady-state expected norm $E\|Y^{(n)}(\infty)\|$ is uniformly bounded in $n$.



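Proof. As we did in the proof of Lemma 9, to simplify exposition, assume that $\lambda^{(n)} - \lambda^* = -\epsilon\nu$. (If not, in this proof we would consider the projection $\lambda^{**}$ of $\lambda^{(n)}$ on $V^*$, instead of $\lambda^*$.) Consider the Lyapunov function $L(Q) = \sum_i \gamma_i Q_i^2$. By Theorem 10, $E L(Q^{(n)}(\infty)) < \infty$. The conditional drift of $L(Q)$ in one time step is given by (let $Q^{(n)}(t) = Q^{(n)}$, $A^{(n)}(t) = A^{(n)}$, and so on, to simplify notation)

$$\begin{aligned}
E\big[L(Q^{(n)}(t+1)) &- L(Q^{(n)}(t)) \mid Q^{(n)}\big] \\
&= E\Big[\sum_{i=1}^{N} \gamma_i \big(Q^{(n)}_i + A^{(n)}_i - S^{(n)}_i + U^{(n)}_i\big)^2 \,\Big|\, Q^{(n)}\Big] - \sum_{i=1}^{N} \gamma_i \big(Q^{(n)}_i\big)^2 \\
&= E\Big[\sum_{i=1}^{N} \gamma_i \big(A^{(n)}_i - S^{(n)}_i + U^{(n)}_i\big)\big(2Q^{(n)}_i + A^{(n)}_i - S^{(n)}_i + U^{(n)}_i\big) \,\Big|\, Q^{(n)}\Big] \\
&= E\Big[\sum_{i=1}^{N} \gamma_i \big(A^{(n)}_i - S^{(n)}_i + U^{(n)}_i\big)^2 + 2\sum_{i=1}^{N} \gamma_i Q^{(n)}_i \big(A^{(n)}_i - S^{(n)}_i + U^{(n)}_i\big) \,\Big|\, Q^{(n)}\Big] \\
&\le b_1 + 2 (\gamma Q^{(n)}) \cdot \big(\lambda^{(n)} - E(S^{(n)} \mid Q^{(n)})\big) \\
&= b_1 + 2 (\gamma Q^{(n)}) \cdot \big(\lambda^{(n)} - \lambda^* + \lambda^* - E(S^{(n)} \mid Q^{(n)})\big) \\
&= b_1 - 2\epsilon \|(\gamma Q^{(n)})_\parallel\| + 2 (\gamma Q^{(n)}) \cdot \big(\lambda^* - E(S^{(n)} \mid Q^{(n)})\big) \\
&\le b_1 - 2\epsilon \|(\gamma Q^{(n)})_\parallel\| + 2 \min_{y \in \mathcal{B}_{\lambda^*}} (\gamma Q^{(n)}) \cdot (\lambda^* - y) \\
&\le b_1 - 2\epsilon \|(\gamma Q^{(n)})_\parallel\| - 2\delta \|(\gamma Q^{(n)})_\perp\|,
\end{aligned} \tag{21}$$

where $b_1$ depends only on $\gamma$, $A^{max}$, $S^{max}$, and the last inequality follows from the definition of MaxWeight and $\mathcal{B}_{\lambda^*}$. Now consider the process $Q^{(n)}(\cdot)$ in stationary regime, and take the expectation of both parts of (21). We obtain

$$2\delta E\big[\|(\gamma Q^{(n)}(\infty))_\perp\|\big] + 2\epsilon E\big[\|(\gamma Q^{(n)}(\infty))_\parallel\|\big] \le b_1. \tag{22}$$

Recalling that $(\gamma Q^{(n)}(\infty))_\perp = Y^{(n)}(\infty)$, we see that $E\|Y^{(n)}(\infty)\|$ is uniformly bounded. □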
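The drift computation for the quadratic Lyapunov function rests on the elementary identity $(q + d)^2 - q^2 = d(2q + d)$ applied componentwise, with $d_i$ standing for the one-step increment $A_i - S_i + U_i$. The following sketch checks that identity numerically; the function names and sample values are illustrative assumptions.

```python
def lyapunov(q, gamma):
    """Quadratic Lyapunov function: sum_i gamma_i * q_i^2."""
    return sum(g * x * x for g, x in zip(gamma, q))


def increment_via_identity(q, d, gamma):
    """L(q + d) - L(q) written as sum_i gamma_i * d_i * (2 q_i + d_i)."""
    return sum(g * di * (2.0 * qi + di) for g, qi, di in zip(gamma, q, d))


# Hypothetical state, increment (A - S + U), and weights
q = [4.0, 2.5, 0.0]
d = [0.3, -1.0, 0.7]
gamma = [1.0, 2.0, 0.5]
q_next = [qi + di for qi, di in zip(q, d)]
direct = lyapunov(q_next, gamma) - lyapunov(q, gamma)
via_identity = increment_via_identity(q, d, gamma)
```

The two values agree up to rounding; splitting `via_identity` into the squared-increment part and the cross term $2\sum_i \gamma_i q_i d_i$ gives exactly the two pieces that are bounded separately in the proof.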


9. LIMIT OF THE QUEUE-DIFFERENTIAL 
PROCESS 

Define $Y^{(n)}(t)$ as the orthogonal projection of $\gamma Q^{(n)}(t)$ on
the subspace $\nu_\perp$. We call $Y^{(n)}(\cdot)$ a queue-differential process.
(Obviously, under the CRP condition, the queue-differential
process is equal to the "queue deviation" process $Y^{(n)}(\cdot) =
(\gamma Q^{(n)}(t))_\perp$ in Section 8. When CRP does not hold, the
"deviation" and "differential" processes are defined differently.
This will be discussed in Section 10.) Denote by $Y^{(n)}(\infty)$ the
corresponding projection of the steady-state $Q^{(n)}(\infty)$, and by
$\Gamma^{(n)}$ its distribution.

We now define a Markov chain $Y^*(\cdot)$, which, in the sense
that will be made precise later, is a limit of the (non-Markov)
process $Y^{(n)}(\cdot)$ as $n \to \infty$.

Markov chain $Y^*(\cdot)$ is defined formally as follows. (We will
show below that, in fact, the distribution $\Gamma^{(n)}$ converges to
the stationary distribution $\Gamma^*$ of $Y^*(\cdot)$.) The state space of
$Y^*(\cdot)$ is $\nu_\perp$. Assume that at time $t$ the "scheduler" chooses
decision

$$k \in \arg\max_{l:\ \mu^l \in V^*} Y^*(t) \cdot \mu^l, \tag{23}$$

which determines the corresponding random amount of service
$S(t)$, provided to the "queues" given by vector $Q^*(t) =
Y^*(t)/\gamma$. After that the (random) amount $A^*(t)$ of new "work"
arrives and is added to the "queues." Finally, the new queue
lengths vector $Q^*(t) - S(t) + A^*(t)$ is transformed into
$Y^*(t+1)$ via componentwise multiplication by $\gamma$ and orthogonal
projection on $\nu_\perp$. (Note that both $Q^*(t)$ and $Y^*(t)$ may have
components of any sign. Also, there is no "wasted service"
here.) In summary, the one-step evolution is described by

$$Y^*(t+1) = Y^*(t) + (\gamma A^*(t) - \gamma S(t))_\perp. \tag{24}$$

Informally, one can interpret the process $Y^*(\cdot)$ as the
queue-differential process $Y^{(n)}(\cdot)$, when $n$ is very large and the
queue length vector $Q^{(n)}$ is both large and has a small angle
with $\nu$. Under these conditions, the only service decisions
$k$ that can be chosen are such that $\mu^k \in V^*$, and the choice is
uniquely determined by $Y^{(n)}(\cdot)$.

Let $P(x, \cdot)$ denote the one-step transition function for the
Markov process $Y^*(\cdot)$. If $x \in \nu_\perp$, then let $B_\epsilon(x) := \{y \in
\nu_\perp : \|y - x\| < \epsilon\}$. The following fact is analogous to
Lemma 6.

Lemma 14. (i) The points $y \in \nu_\perp$, such that

$$k \in \arg\max_{l:\ \mu^l \in V^*} y \cdot \mu^l \tag{25}$$

is non-unique, form a set of zero Lebesgue measure. Moreover, if
$y$ is such that the corresponding decision $k$ is unique, then for a
sufficiently small $\epsilon > 0$ the decision $k$ is also the unique element
of

$$\arg\max_{l:\ \mu^l \in V^*} z \cdot \mu^l$$

for all $z \in B_\epsilon(y)$.

(ii) There exist small $\epsilon > 0$ and constants $c_* > 0$, $c^* > 0$ such
that $P(x, \cdot)$ is absolutely continuous and, moreover, uniformly
in $x \in \nu_\perp$, the density of $P(x, \cdot)$ is lower bounded by $c_*$ on set
$B_\epsilon(x)$ and is upper bounded by $c^*$ everywhere.

Proof. Statement (i) is obvious. Statement (ii) follows
from our assumptions on the distribution of $A^*(t)$, the fact
that $A_{\max} > S_{\max}$, and the one-step evolution rule (24). We
omit details. □
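
To make the one-step rule (23)-(24) concrete, the following is a minimal simulation sketch of a toy instance. Everything specific in it is an assumption made for illustration and is not taken from the paper: two flows, $\gamma = (1, 1)$, $\nu$ proportional to $(1, 1)$, two service decisions with mean vectors $\mu^1 = (0.2, 0)$ and $\mu^2 = (0, 0.2)$, and arrivals with a bounded density on $[0, 0.3]$ and mean $0.1$ per flow (so that the maximum arrival exceeds the maximum service amount $0.25$).

```python
import numpy as np

# Hypothetical toy instance (not from the paper): N = 2 flows, gamma = (1, 1),
# nu proportional to (1, 1), so nu_perp = {y : y_1 + y_2 = 0}.
GAMMA = np.array([1.0, 1.0])
NU = np.array([1.0, 1.0])
# Decision k serves flow k with 0.25 units w.p. 0.8, so mu^k = 0.2 * e_k.
MU = np.array([[0.2, 0.0],
               [0.0, 0.2]])


def project_perp(x, nu=NU):
    """Orthogonal projection of x onto the subspace orthogonal to nu."""
    unit = nu / np.linalg.norm(nu)
    return x - np.dot(x, unit) * unit


def step(y, rng):
    """One transition: MaxWeight choice as in (23), then update (24)."""
    k = int(np.argmax(MU @ y))                  # ties broken by lowest index
    s = np.zeros(2)
    s[k] = 0.25 if rng.random() < 0.8 else 0.0  # random service S(t)
    # Arrivals A*(t): i.i.d. per flow, density 2(1 - a/0.3)/0.3 on [0, 0.3],
    # which has mean 0.1 (an assumed distribution, chosen for simplicity).
    a = 0.3 * (1.0 - np.sqrt(rng.random(2)))
    return y + project_perp(GAMMA * a - GAMMA * s)


def simulate(y0, steps, seed=0):
    """Return the trajectory Y*(0), ..., Y*(steps) as a (steps + 1, 2) array."""
    rng = np.random.default_rng(seed)
    traj = np.empty((steps + 1, 2))
    traj[0] = project_perp(np.asarray(y0, dtype=float))
    for t in range(steps):
        traj[t + 1] = step(traj[t], rng)
    return traj


if __name__ == "__main__":
    traj = simulate([3.0, -3.0], 2000)
    print("final state:", traj[-1])
    print("mean |Y_1| over last 1000 steps:", np.abs(traj[1000:, 0]).mean())
```

In this toy instance the state always stays in $\nu_\perp$, and the flow with the larger component is the one served, which pushes the state back toward the origin. This is the kind of behavior that positive recurrence of $Y^*(\cdot)$ formalizes.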


Lemma 15. For the Markov chain $Y^*(\cdot)$, every compact set
is petite.

The proof easily follows from Lemma 14, by an argument
analogous to that in the proof of Lemma 8. We omit
details.



Next, we establish some properties of a stationary distribution
$\Gamma^*$ of the Markov process $Y^*(\cdot)$, assuming a stationary
distribution exists. This will help us later prove that the stationary
distribution in fact exists and is unique.

Lemma 16. If $\Gamma^*$ is a stationary distribution of $Y^*(\cdot)$, then
$\Gamma^*$ is equivalent to the Lebesgue measure $\mathcal{L}$, i.e. $\Gamma^* \ll \mathcal{L}$ and
$\mathcal{L} \ll \Gamma^*$.

Proof. $\Gamma^* \ll \mathcal{L}$: This follows from Lemma 14.

$\mathcal{L} \ll \Gamma^*$: It suffices to show that $\Gamma^*(B_r(z)) > 0$ for any
$z \in \nu_\perp$ and $r > 0$. Consider the process $Y^*(\cdot)$ with the
distribution of $Y^*(0)$ equal to $\Gamma^*$. (Then the process is of
course stationary.) Fix any $0 < \beta < 1$ and choose a compact
set $D \subset \nu_\perp$ such that $\Gamma^*(D) > \beta$. Using Lemma 14 we can
easily show that there exist a time $\tau > 0$ and a constant $\Delta > 0$,
such that, uniformly in $Y^*(0) = x \in D$,

$$P\{Y^*(\tau) \in B_r(z) \mid Y^*(0) = x\} > \Delta,$$

and therefore

$$\Gamma^*(B_r(z)) > \beta\Delta > 0.$$

□

Lemma 17. Suppose $\Gamma^*$ is a stationary distribution of $Y^*(\cdot)$.
Then $P_x(Y^* \to \infty) = 0$, $\Gamma^*$-a.s., and hence $P_x(Y^* \to \infty) = 0$,
$\mathcal{L}$-a.s.

Proof. The proof is by contradiction. Let $Y^*(0)$ have the
stationary distribution $\Gamma^*$, and assume that there exist $\epsilon > 0$,
$\epsilon_1 > 0$ such that

$$\Gamma^*(\{x : P_x(Y^* \to \infty) > \epsilon_1\}) > \epsilon.$$

This would imply that $\limsup_{t \to \infty} P(Y^*(t) \in D) < 1 - \epsilon\epsilon_1$ for
every compact set $D \subset \nu_\perp$. This is impossible, because the
distribution of $Y^*(t)$ is equal to $\Gamma^*$ for all $t$. □
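
The two implications used in this proof can be written out as follows. This is a sketch of the omitted details, not the authors' own wording. Under the assumption of the proof, conditioning on $Y^*(0)$ gives

$$P(Y^* \to \infty) = \int \Gamma^*(dx)\, P_x(Y^* \to \infty) > \epsilon\epsilon_1.$$

On the event $\{Y^* \to \infty\}$ the process leaves any fixed compact set $D$ for good, so

$$\limsup_{t \to \infty} P(Y^*(t) \in D) \le 1 - P(Y^* \to \infty) < 1 - \epsilon\epsilon_1.$$

On the other hand, since $\Gamma^*$ is a probability measure on $\nu_\perp$, one can pick a compact $D$ with $\Gamma^*(D) > 1 - \epsilon\epsilon_1$, and then $P(Y^*(t) \in D) = \Gamma^*(D) > 1 - \epsilon\epsilon_1$ for every $t$, which is the contradiction.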

Lemma 18. If process $Y^*(\cdot)$ has a stationary distribution, it
is non-evanescent.

Proof. Consider process $Y^*(\cdot)$ with fixed initial state
$Y^*(0) = x$. Consider one-step transition. The distribution
of $Y^*(1)$ is absolutely continuous with respect to $\mathcal{L}$. Thus,
by Lemma 17, with probability 1, $z = Y^*(1)$ is such that
$P_z(Y^* \to \infty) = 0$. Then, $P_x(Y^* \to \infty) = 0$. □

Lemma 19. Suppose $\Gamma^*$ is a stationary distribution of $Y^*(\cdot)$.
Then, the Markov chain is positive Harris recurrent, and
therefore $\Gamma^*$ is its unique stationary distribution.

Proof. Since every compact set is petite (Lemma 15) and
the process is non-evanescent (Lemma 18), it is Harris
recurrent by Proposition 4. But since it has a finite invariant
measure $\Gamma^*$, $Y^*(\cdot)$ is positive Harris recurrent. □

We now show the existence of a stationary distribution of
$Y^*(\cdot)$.

Lemma 20. Every weak limit point of the sequence of
distributions $\Gamma^{(n)}$ is a stationary distribution of the process
$Y^*(\cdot)$.

Proof. Let $\Gamma^*$ be a weak limit of $\Gamma^{(n)}$ along a subsequence
of $n$. We can make the following observations.

(a) Observe that uniformly in all (large) $n$ and the
distributions of $Q^{(n)}(0)$, the distribution of $Y^{(n)}(1)$ is absolutely
continuous w.r.t. Lebesgue measure, with upper bounded
density. (This easily follows from Lemma 11 and the fact that
$\|Q^{(n)}(1) - Q^{(n)}(0)\|$ is uniformly bounded.) Then, we see
that $\Gamma^*$ is absolutely continuous with bounded density.

(b) Consider any point $y \in \nu_\perp$ such that the decision $k$ in
(25) is unique and a small $\epsilon > 0$ such that this decision $k$
is also unique for all $z \in B_\epsilon(y)$. (See Lemma 14(i).) Then,
there exists a sufficiently large $C > 0$ such that, uniformly in
$n$, conditions $\|Q^{(n)}(t)\| > C$ and $Y^{(n)}(t) \in B_\epsilon(y)$ imply that
the same decision $k$ will be unique at time $t$ for the process
$Q^{(n)}(\cdot)$.

Using these two observations, Lemma 12, and the fact that
the distribution of $A^{(n)}(t)$ converges to that of $A^*(t)$, we can
choose a further subsequence of $n$ along which the following
property holds. The stationary versions of processes $Q^{(n)}(\cdot)$
and the process $Y^*(\cdot)$ with distribution of $Y^*(0)$ equal to $\Gamma^*$
can be constructed on one common probability space, so that
with probability 1:

(c) for all large $n$, the same decision $k$ is chosen at time 0 in
the processes $Q^{(n)}(\cdot)$ and $Y^*(\cdot)$;

(d) $Y^{(n)}(0) \to Y^*(0)$ and $Y^{(n)}(1) \to Y^*(1)$.

This, in turn, implies that for any bounded continuous
function $g$ we have

$$E[g(Y^*(0))] = \lim_{n \to \infty} E\left[g(Y^{(n)}(0))\right],$$

$$E[g(Y^*(1))] = \lim_{n \to \infty} E\left[g(Y^{(n)}(1))\right].$$

But $E\left[g(Y^{(n)}(0))\right] = E\left[g(Y^{(n)}(1))\right]$ for all $n$. Therefore,
$E[g(Y^*(0))] = E[g(Y^*(1))]$. This proves stationarity of $\Gamma^*$. □

Theorem 21. The Markov process $Y^*(\cdot)$ is positive Harris
recurrent. The sequence $\Gamma^{(n)}$ [i.e., the distributions of $Y^{(n)}(\infty)$]
weakly converges to the unique stationary distribution $\Gamma^*$ of
$Y^*(\cdot)$.

Proof. This follows from Lemma 20 and Lemma 19. □ 
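
Lemma 20 concerns weak limit points, so the proof implicitly uses the fact that such limit points exist. One way to see this is sketched below. It assumes the uniform bound on $E\|Y^{(n)}(\infty)\|$ obtained at the end of Section 8, and the symbol $M$ for that bound is introduced here for illustration only. By Markov's inequality,

$$\sup_n P\left(\|Y^{(n)}(\infty)\| > K\right) \le \frac{M}{K} \to 0, \quad K \to \infty,$$

so the sequence $\Gamma^{(n)}$ is tight on the finite-dimensional space $\nu_\perp$, and every subsequence has a further subsequence converging weakly to a probability distribution. By Lemma 20 any such limit is stationary for $Y^*(\cdot)$, and by Lemma 19 it is the unique stationary distribution $\Gamma^*$. Hence the whole sequence converges to $\Gamma^*$.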

We are finally in a position to prove Theorem 2.

Proof of Theorem 2. By Theorem 21, the process $Y^*(\cdot)$
is positive Harris recurrent. Moreover, we know that it is such
that every compact set is petite. We can pick any compact set
$D$ such that $\Gamma^*(D) > 0$, and using Nummelin splitting view
the process $Y^*(\cdot)$ as having an atom state, with finite average
return time to this atom. We see that the cumulative "service
process" $G^*(\cdot)$ corresponding to $Y^*(\cdot)$ in steady-state is such
that

$$\max_i \lim_{T \to \infty} P(G^*_i(T) = 0) = 0.$$

Finally, the argument used in the proof of Lemma 20 shows
that the stationary versions of processes $Y^*(\cdot)$ and $Q^{(n)}(\cdot)$
for all (large) $n$ can be constructed on a common probability
space in a way such that, w.p.1, for any $T > 0$

$$G^{(n)}(T) \to G^*(T).$$

This implies (4). □

10. GENERALIZATION TO THE CASE 
WHEN CRP CONDITION DOES NOT 
NECESSARILY HOLD 

If the CRP condition does not necessarily hold, let $\nu$ denote the
normal cone to $V$ at point $\lambda^*$; it has dimension $d \ge 1$. (In
the CRP case, $d = 1$ and $\nu$ is a ray.) Fix any positive vector
$\nu'$ which lies in the relative interior of $\nu$. Then, $V^*$ is defined
more generally as

$$V^* = \arg\max_{x \in V} \nu' \cdot x;$$

it is a $(N - d)$-dimensional face of $V$. By $\nu_\perp$ we denote the
$(N - d)$-dimensional subspace orthogonal to $\nu$.

We will denote by $x_*$ the projection of a vector $x$ on the
normal cone $\nu$; that is, $x_*$ is the point of $\nu$ closest to $x$. Then let
$x_\perp = x - x_*$, and let $x_{\perp,sp}$ be the orthogonal projection of $x$
on the subspace $\nu_\perp$. Note the difference between the
definitions of $x_\perp$ and $x_{\perp,sp}$. (In the CRP case, $x_\perp = x_{\perp,sp}$. In the
non-CRP case they are in general different.) We always have
$\|x_{\perp,sp}\| \le \|x_\perp\|$. Note that, if $x_*$ lies in the relative interior of
$\nu$, then $x_\perp = x_{\perp,sp}$.
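
The difference between $x_\perp$ and $x_{\perp,sp}$ can be checked numerically on a small example. The sketch below is illustrative only: it assumes the cone is given by a short list of generating vectors (the generators and the test vectors are hypothetical, not taken from the paper), and it finds the cone projection by brute force over subsets of generators, which is practical only for a handful of generators.

```python
import itertools
import numpy as np


def project_onto_cone(x, generators):
    """Closest point to x in the cone {sum_j c_j g_j : c_j >= 0}.

    Brute force: for every subset of generators, project x orthogonally onto
    their span and keep the candidate if its coefficients are nonnegative.
    The closest feasible candidate is the projection onto the cone.
    """
    best = np.zeros_like(x)            # the apex 0 always belongs to the cone
    best_dist = np.linalg.norm(x)
    m = len(generators)
    for r in range(1, m + 1):
        for subset in itertools.combinations(range(m), r):
            G = np.column_stack([generators[j] for j in subset])
            coef, *_ = np.linalg.lstsq(G, x, rcond=None)
            if np.all(coef >= -1e-12):
                p = G @ coef
                d = np.linalg.norm(x - p)
                if d < best_dist:
                    best, best_dist = p, d
    return best


def project_onto_orthogonal_complement(x, generators):
    """Orthogonal projection of x onto the subspace orthogonal to the cone."""
    G = np.column_stack(generators)
    coef, *_ = np.linalg.lstsq(G, x, rcond=None)
    return x - G @ coef


# Hypothetical cone of dimension d = 2 in R^3.
gens = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]

x = np.array([2.0, -1.0, 3.0])         # x_* lands on the boundary of the cone
x_perp = x - project_onto_cone(x, gens)
x_perp_sp = project_onto_orthogonal_complement(x, gens)
print(np.linalg.norm(x_perp_sp), "<=", np.linalg.norm(x_perp))

x2 = np.array([2.0, 1.0, 3.0])         # x_* lies in the relative interior
print(x2 - project_onto_cone(x2, gens), project_onto_orthogonal_complement(x2, gens))
```

For the first vector the two projections differ, and the norm inequality is strict. For the second vector they coincide, which matches the remark about $x_*$ lying in the relative interior of $\nu$.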

In this notation, the entire development in Sections 7 and 8 is 
carried out essentially as is, with very minor adjustments. 

The development in Section 9 is carried out with small
adjustments, which are as follows. The queue-differential process
is defined as $Y^{(n)}(t) = (\gamma Q^{(n)}(t))_{\perp,sp}$. Correspondingly, the
one-step evolution of $Y^*(\cdot)$ is defined by (23) and

$$Y^*(t+1) = Y^*(t) + (\gamma A^*(t) - \gamma S(t))_{\perp,sp}.$$

Therefore, the state space for both $Y^{(n)}(\cdot)$ and $Y^*(\cdot)$ is $\nu_\perp$.
The statement of Theorem 21 and the proof of Theorem 2
remain unchanged.

The proof of the key Lemma 20 requires, in addition to Lemma 12,
the following Lemma 22. Let $h(x)$ denote the distance from
$x_*$ to the relative boundary of the cone $\nu$. (To be precise,
$h(x)$ is defined as the distance from $x_*$ to the set {relative
boundary of the cone $\nu$} \ {boundary of the positive orthant
$\mathbb{R}^N_+$}.)


Lemma 22. As $n \to \infty$, $h(Q^{(n)}(\infty)) \to \infty$ in probability.


This lemma is easily proved, because the contrary, along with
Lemma 20 and Lemma 13, would imply that the frequency of
choosing scheduling decisions outside $V^*$ would not vanish
as $n \to \infty$; that would contradict stability when $n$ is large.

Then, in the proof of Lemma 20, in statement (b), the
condition $\|Q^{(n)}(t)\| > C$ is replaced by $h(Q^{(n)}(t)) > C$; also,
Lemma 22 is used along with Lemma 12.



